Abstract. The 3MdB (Mexican Million Models database) is a large database of photoionization models for H II regions. The models have close to 15 free parameters, including the description of the ionizing Spectral Energy Distribution (effective temperature, luminosity, and surface gravity, for different types of stellar atmosphere models) and the description of the ionized gas (distance to the ionizing source, density, abundances of the most common elements, dust). The outputs of the models are more than 70 emission line intensities, the ionic fractions and temperatures. All the parameters and outputs are stored in a MySQL database, allowing the user to search the database, for example, for all the models that reproduce a given set of observations.
1. Introduction

The study of the ionized interstellar medium 
(in the present case I consider only H II re- 
gions) is mainly based on the analysis of the 
observed emission line intensities. From line 
ratios one may determine physical and chem- 
ical parameters of the nebulae such as the elec- 
tron temperature, the electron density and the 
abundances of the most common elements. 
The characteristics of the ionizing spectrum 
(effective temperature, luminosity) can also be 
derived from the line intensities. 

The interaction between the ionizing source and the gas is computed by a photoionization code (e.g. Cloudy, see Ferland et al. 1998), allowing one to construct numerical models of H II regions, including the intensities of the emission lines. Such models can then be compared to the observations, and if all the observables are reproduced one can consider that the model is close to a good description of the observed object. One must still be aware that multiple solutions can exist, see Sec. 2.1.

I present here a new database of photoionization models, which can be used to look for models that reproduce a given observation or a given catalog of observations. This tool can be understood as a kind of virtual observatory of H II regions, where line intensities from millions of models can be mined.

2. P-space and O-space

One can describe a (photoionization) model as a link from the parameter-space (P-space) to the observable-space (O-space). The parameter-space describes an object in terms of effective temperature, luminosity, size of the nebula, radial density variation, abundances, presence of dust, etc. This can be seen as the set of inputs required to compute the model. The object in the observable-space is described by the set of the emission line intensities. This is also the set of outputs of the photoionization model.

Fig. 1. O-space and P-space in a simple case where both spaces have only 2 dimensions. The 2 parameters in the P-space are Z and U; the 2 observables in the O-space are the line ratios [NII]/Hbeta and [OIII]/Hbeta. The legend distinguishes models fitting an observation, models not fitting any observation, next-generation models, and the result of a model in the O-space.

The dimension of the P-space is the number of free parameters needed to describe a model. It can easily reach a value of 15 for 1D models (as when running Cloudy), and many more for 3D models where the description of the density distribution is more complex (using e.g. Cloudy_3D, see Morisset 2006). The dimension of the O-space is the number of emission line intensities that one can obtain from the photoionization code. It can be hundreds of lines! But most of these lines are either redundant (their intensity is proportional to that of another line, e.g. [OIII]4959 and [OIII]5007) or not observed, because of their low signal-to-noise ratio or because no observation is available in the corresponding wavelength range for a particular object.

In the O-space we find the results of the modeling process (what we classically call the models: projections from the P-space into the O-space using a code) and also the observations of "real" objects. Actually, taking into account the error bars around each observed emission line intensity transforms each observed object into a hyper-box around the observed values (in the O-space).

Fig. 1 illustrates the relation between the P-space and the O-space. The modeling process is represented by the link between the 2 spaces. A model is actually the projection of a set of parameters (a point in the P-space) into a point in the O-space.

2.1. Non linearity, degeneracies 

Any point in the P-space transforms into a point in the O-space. The function that transforms a point in P into a point in O is continuous, therefore any shape in the P-space also transforms into a shape in the O-space. The relation between the shape in the P-space and the corresponding shape in the O-space is far from being linear. For example, a rectangle in the P-space does not transform into a rectangle in the O-space, but rather into a complex hyper-shape. This is illustrated by Fig. 2 in Stasińska et al. (2006), where a regular grid in the P-space (of 2 dimensions, U and Z) transforms into a curved shape in the O-space.

The reverse is also true: a rectangular shape in the O-space is not obtained from a rectangular shape in the P-space. This is why it is not possible to easily obtain the parameters of the models that fit a given observation (see Sec. 2.2).

In the case illustrated by Fig. 2 in Stasińska et al. (2006), the problem is even worse, as the projection into the O-space of the rectangle from the P-space is a surface that overlaps itself. This leads to a degeneracy, as the same point in the O-space is obtained from 2 different points in the P-space.

2.2. Fitting an observed object 

The action of fitting an observation by some 
models is finding the models which are 
close to a given observation in the O-space. 
Considering the errors on the observations, this 
means finding the models that fall in the hyper- 
box around the point that represent the object 
in the O-space. In the case illustrated by Fig.[Tl 
the fitting models are falling within the rect- 
angle around the observations. Due to the high 
non-linearity of the transformation between the 
P- and the O-space, there is no simple way to 
go from an observation to the set of physical 
parameters that describe the object. 
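
As a minimal sketch of the hyper-box test described above, the following Python function checks whether a model falls within the tolerances around an observation. The function name, the dictionary keys and the numerical values are illustrative assumptions, not part of the 3MdB.

```python
def fits_observation(model, observed, tolerance):
    """Return True if the model lies inside the hyper-box around the observation.

    All three arguments are dicts keyed by observable name (hypothetical
    names); `tolerance` holds the accepted half-width on each observable.
    """
    return all(
        abs(model[name] - observed[name]) <= tolerance[name]
        for name in observed
    )


# Illustrative values only (log line ratios), not taken from the database.
observed = {"log_NII_Hb": -0.50, "log_OIII_Hb": 0.40}
tolerance = {"log_NII_Hb": 0.10, "log_OIII_Hb": 0.05}

models = [
    {"log_NII_Hb": -0.45, "log_OIII_Hb": 0.42},  # inside the box
    {"log_NII_Hb": -0.45, "log_OIII_Hb": 0.50},  # outside on [OIII]/Hbeta
]
fitting = [m for m in models if fits_observation(m, observed, tolerance)]
print(len(fitting))
```

Only the observables present in `observed` are tested, so a model can be compared with an object for which only some lines are available.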



There are various ways to find the set of values in the P-space that reproduce an observed object (a point in the O-space, or a hypercube if we take the error bars into account):

- By hand: running models and figuring out what the effects in the O-space are of changing something in the P-space.

- By an automatic chi-squared (χ²) method: for example, Cloudy can optimize a set of parameters to fit a set of observations.

Generally the two methods above lead to the definition of a single "best" model fitting the observations of an object.

- Regular grids of models: this method can be very useful to see the effects of changing one parameter on the observables. It gives the possibility of finding various models that fit the same observation (within the errors). One major problem is that only a few parameters can be changed (7 parameters with 5 values each already lead to about 80,000 models!). A second problem is that most of the models are totally useless (they lie in the corners of the hypercube in the P-space, and therefore most of the time do not correspond to any observation).

- Irregular grids of models: this is the case of a grid that can be adapted to increase the density of models in the P-space in regions where this is useful. Such an approach needs observations to know which loci in the P-space are "good" (those that project onto a "good" locus in the O-space, i.e. where there are observed objects). For this one can use a kind of genetic algorithm, see next section.
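
The size of a regular grid is the number of values per parameter raised to the power of the number of varied parameters. The short check below (the function name is a hypothetical helper) suggests that the figure of about 80,000 models quoted for regular grids corresponds to 5 values on each of 7 parameters, and shows how quickly the count grows if all of the roughly 15 parameters of the P-space are varied.

```python
def regular_grid_size(n_parameters, n_values):
    """Number of models in a regular grid with n_values per parameter."""
    return n_values ** n_parameters


print(regular_grid_size(7, 5))   # 78125
print(regular_grid_size(5, 7))   # 16807
print(regular_grid_size(15, 5))  # 30517578125
```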

3. A genetic algorithm for the 
definition of new models 

To define a genetic algorithm, we must consider two phases: a phase of selection of parents and a phase of reproduction with random evolution, generating children.

The selection of the parent models is performed in the O-space, in the hyper-boxes around the observations, the size of each hyper-box being the acceptable error on each observable (e.g. emission line intensity). Any model that falls within a hyper-box around an observation is selected for reproduction (it is a parent model). A new generation of models is generated from the set of parent models. The values of the parameters for the children are drawn randomly around the values of the parent models, within a given range. Each parent generates a given number of children. In the present case, there is no "sexual" reproduction, in the sense that only one parent is needed to make children (a kind of unicellular organism reproduction by division and random evolution). This process is illustrated in Fig. 1, where new models are represented in the P-space around the parent models, which fit an object in the O-space. New models in the O-space can fall around observations that were not fitted before, or be closer to an O-point (leading to a better fit).
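
The two phases can be sketched in Python as follows. This is an illustration under stated assumptions, not the code driving the 3MdB: the function names, the parameter and observable names, and the choice of a uniform random draw within the P-box are all hypothetical (the text only specifies a random draw within a given range).

```python
import random


def select_parents(models, observations, tolerance):
    """Keep the models that fall in the hyper-box around any observation.

    Each model is a (params, observables) pair of dicts; `tolerance` gives
    the accepted half-width of the O-space box for each observable.
    """
    parents = []
    for params, observables in models:
        for obs in observations:
            if all(abs(observables[k] - obs[k]) <= tolerance[k] for k in obs):
                parents.append(params)
                break  # one match is enough to become a parent
    return parents


def make_children(parents, p_box, n_children, rng):
    """Asexual reproduction: each parent yields n_children, each parameter
    drawn uniformly within +/- p_box[name] of the parent's value."""
    children = []
    for parent in parents:
        for _ in range(n_children):
            child = {
                name: value + rng.uniform(-p_box[name], p_box[name])
                for name, value in parent.items()
            }
            children.append(child)
    return children


# Toy example with two parameters and two observables (made-up values).
models = [
    ({"logU": -2.5, "logZ": -0.3}, {"log_NII_Hb": -0.48, "log_OIII_Hb": 0.41}),
    ({"logU": -3.5, "logZ": 0.2}, {"log_NII_Hb": 0.10, "log_OIII_Hb": -0.30}),
]
observations = [{"log_NII_Hb": -0.50, "log_OIII_Hb": 0.40}]
tolerance = {"log_NII_Hb": 0.10, "log_OIII_Hb": 0.05}
p_box = {"logU": 0.2, "logZ": 0.1}

rng = random.Random(0)
parents = select_parents(models, observations, tolerance)
children = make_children(parents, p_box, n_children=4, rng=rng)
print(len(parents), len(children))
```

In a full loop, each child would then be passed to the photoionization code to obtain its position in the O-space before the next selection step.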

The sizes of the different boxes in Fig. 1 play an important role: if the size of the hyper-box in the O-space is small, the number of fitting models is small, but the quality of their fit is good. On the contrary, if the size is big, there will be more models fitting the observations. Some observations that cannot be fitted within a small box can be fitted by models (of lower quality) with a bigger box.

On the P-space side, the size of the box is the range in which the parameters will be randomly drawn. A big P-box allows an exploration of the P-space, with a possibility of finding models that fit new observations. But given that the new parameters can be quite different from the "working" values, the probability of finding a better fit is small. On the contrary, defining small P-boxes gives better fits around objects already fitted, leading to a densification of the models around the observed points in the O-space.

4. The 3MdB 

The Mexican Million Models database is a project to build a huge database of photoionization models, in which the user can search easily and quickly for models that reproduce a given set of observations.

There are more than 15 parameters that can 
be varied to describe a model: 



- The ionizing SED can be described as a Planck function (2 parameters: the effective temperature and the luminosity) or as a stellar atmosphere model (with various available libraries); in the latter case the stellar metallicity and the surface gravity may also be provided. It is also possible to describe the SED in terms of a stellar cluster, with a Starburst99 (Leitherer C. et al.) ionizing flux (given an age of the burst), or even as a description of hundreds of individual stars that form the cluster.

- The ionized gas: the inner radius of the 
nebula, the hydrogen density, the abun- 
dances of the main elements, the presence 
of dust (composition, density), a filling fac- 
tor for the gas. 

Once the model is computed (using Cloudy), the output files are read and the entry in the database for the model is completed by adding to the parameters the intensities of more than 70 emission lines and all the ionic fractions and temperatures (integrated over the line of sight and over the volume).

An entry in the 3MdB is: a point in the P- 
space (defined by the values of all the param- 
eters), the corresponding point in the O-space 
(the values of the observables, i.e. line intensi- 
ties), plus a set of other characteristics of the 
models, such as the recombination radius, the 
ionic fractions and temperatures, the mean ion- 
ization parameters, all being parameters that 
can be useful to the user in understanding the 
model. 

The genetic algorithm described in Sec. 3 is used to compute the values of the parameters for the new generation of models. The observations that are used for the selection of the parent models are from various catalogs, such as metal-poor galaxies from Izotov et al. (2006), or the M33 Spitzer observations from Rubin et al. (2008).

All the models are in a single table in the 
database, whatever the set of observations used 
to select the models: some models computed to 
fit (optical) SDSS data can be useful for fitting 
the (IR) M33 HII regions. 

The database contains 1,350,000 models (October 2008). It grows at a rate of 350 models/hour. It presently runs on a computer with two dual-core 64-bit AMD processors.

The data are in MySQL tables, driven by 
IDL routines calling Cloudy, reading the out- 
puts and filling the database. 

There is a queuing system with priorities: a set of models can be sent to the queue at any moment, the models with higher priority being run before the ones with lower priority. This allows the user to quickly run a small grid of models while a larger grid with lower priority is waiting.
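
The behavior of such a queue can be illustrated with a few lines of Python. The actual system is driven by IDL routines and MySQL tables, so the class below, its method names and the model identifiers are only a hypothetical stand-in showing the ordering rule.

```python
import heapq
import itertools


class ModelQueue:
    """Priority queue of pending models; higher priority runs first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps submission order on ties

    def submit(self, model_id, priority):
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-priority, next(self._counter), model_id))

    def next_model(self):
        """Return the id of the next model to run, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = ModelQueue()
for i in range(3):
    queue.submit(f"large_grid_{i}", priority=1)
queue.submit("small_grid_0", priority=5)  # submitted later, runs first

order = [queue.next_model() for _ in range(4)]
print(order)
```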
5. The future

5.1. A User-friendly interface 

The 3MdB will be accessible through a user-friendly interface in the near future. It will be possible to select the models by any criteria, for example by fitting a given object or set of objects, within observational tolerances. The time currently needed to search the whole database for all the models reproducing 10 emission line ratios is only 10 seconds.
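
A search of this kind amounts to one range condition per observable. The sketch below uses Python's built-in sqlite3 module as a stand-in for the MySQL server; the table name, the column names and the inserted values are hypothetical and do not reflect the actual 3MdB schema.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server; schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE models (id INTEGER PRIMARY KEY, logU REAL, logZ REAL, "
    "log_NII_Hb REAL, log_OIII_Hb REAL)"
)
conn.executemany(
    "INSERT INTO models (logU, logZ, log_NII_Hb, log_OIII_Hb) "
    "VALUES (?, ?, ?, ?)",
    [
        (-2.5, -0.3, -0.48, 0.41),
        (-3.5, 0.2, 0.10, -0.30),
        (-2.4, -0.4, -0.55, 0.43),
    ],
)


def find_models(conn, observed, tolerance):
    """Return the rows whose observables all lie within the tolerances."""
    clauses, values = [], []
    for column, value in observed.items():
        # Column names come from a fixed, trusted dict, not from user input.
        clauses.append(f"{column} BETWEEN ? AND ?")
        values += [value - tolerance[column], value + tolerance[column]]
    sql = (
        "SELECT id, logU, logZ FROM models WHERE "
        + " AND ".join(clauses)
        + " ORDER BY id"
    )
    return conn.execute(sql, values).fetchall()


observed = {"log_NII_Hb": -0.50, "log_OIII_Hb": 0.40}
tolerance = {"log_NII_Hb": 0.10, "log_OIII_Hb": 0.05}
rows = find_models(conn, observed, tolerance)
print(rows)
```

Because every model lives in a single table, the same query form applies whatever set of observations was originally used to generate the models.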



5.2. Virtual Observatory integration 

One of the next evolutions of the 3MdB is to ensure interoperability with the emission line databases of H II regions or galaxies. It will be possible to search the 3MdB directly for the list of models that reproduce an object from the VO space.